Chi2: feature selection and discretization of numeric attributes

Authors

  • Huan Liu
  • Rudy Setiono
Abstract

Discretization can turn numeric attributes into discrete ones. Feature selection can eliminate some irrelevant attributes. This paper describes Chi2, a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data, and achieves feature selection via discretization. The empirical results demonstrate that Chi2 is effective in feature selection and discretization of numeric and ordinal attributes.
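
The two quantities the abstract relies on can be illustrated with a minimal sketch. It assumes the usual Pearson χ² on a 2×k table of class counts for two adjacent intervals (as in ChiMerge-style merging) and an inconsistency rate that counts, for each group of identical discretized patterns, the samples outside the majority class. The function names `chi2_adjacent` and `inconsistency_rate` are hypothetical, and the abstract does not give the significance levels or the stopping threshold, so neither is shown here.

```python
from collections import Counter


def chi2_adjacent(counts_a, counts_b):
    """Pearson chi-square for two adjacent intervals.

    counts_a, counts_b: dicts mapping class label -> sample count
    in each interval (hypothetical input format).
    """
    classes = set(counts_a) | set(counts_b)
    row_totals = (sum(counts_a.values()), sum(counts_b.values()))
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip((counts_a, counts_b), row_totals):
        for c in classes:
            col_total = counts_a.get(c, 0) + counts_b.get(c, 0)
            expected = row_total * col_total / n
            if expected == 0:
                # Assumption: skip empty cells instead of substituting
                # a small constant for the expected count.
                continue
            stat += (row.get(c, 0) - expected) ** 2 / expected
    return stat


def inconsistency_rate(rows, labels):
    """Fraction of samples not in the majority class of their pattern.

    rows: discretized attribute tuples; labels: class labels.
    """
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(tuple(row), Counter())[label] += 1
    inconsistent = sum(
        sum(c.values()) - max(c.values()) for c in groups.values()
    )
    return inconsistent / len(labels)
```

A low χ² value suggests that two adjacent intervals have similar class distributions and are candidates for merging. The inconsistency rate is one way to check that merging has not made differently labelled samples indistinguishable.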

Similar resources

Feature Selection via Discretization

Discretization can turn numeric attributes into discrete ones. Feature selection can eliminate some irrelevant and/or redundant attributes. Chi2 is a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data. It achieves feature selection via discretization. It can handle mixed attributes, work with mul...

Discovering Maximal Generalized Decision Rules Through Horizontal and Vertical Data Reduction

We present a method to learn maximal generalized decision rules from databases by integrating discretization, generalization and rough set feature selection. Our method reduces the data horizontally and vertically. In the first phase, discretization and generalization are integrated and the numeric attributes are discretized into a few intervals. The primitive values of symbolic attributes are ...

Basket Analysis on Meningitis Data

Basket Analysis is the most representative approach in recent study of data mining. However, it cannot be directly applied to the data including numeric data. In this paper, we claim the importance of the selection and the discretization of numeric attributes in the data preprocessing stage for the wider application of Basket Analysis, and propose the algorithm and the performance measures for ...

A Modified Chi2 Algorithm for Discretization

Since the ChiMerge algorithm was first proposed by Kerber in 1992, it has become a widely used and discussed discretization method. The Chi2 algorithm is a modification to the ChiMerge method. It automates the discretization process by introducing an inconsistency rate as the stopping criterion and it automatically selects the significance value. In addition, it adds a finer phase aimed at fea...

A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)

Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...

Publication date: 1995